ABSTRACT

Nonlinear dimensionality reduction (NLDR) algorithms such
as Isomap, LLE and Laplacian Eigenmaps address the problem
of representing high-dimensional nonlinear data in terms
of low-dimensional coordinates which represent the intrinsic
structure of the data. This paradigm incorporates the
assumption that real-valued coordinates provide a rich enough
class of functions to represent the data faithfully and
efficiently. On the other hand, there are simple structures
which challenge this assumption: the circle, for example, is
one-dimensional but its faithful representation requires two
real coordinates. In this work, we present a strategy for
constructing circle-valued functions on a statistical data set.
We develop a machinery of persistent cohomology to identify
candidates for significant circle-structures in the data,
and we use harmonic smoothing and integration to obtain
the circle-valued coordinate functions themselves. We suggest
that this enriched class of coordinate functions permits
a precise NLDR analysis of a broader range of realistic data
sets.

Categories and Subject Descriptors 

G.3 [Probability and statistics]: multivariate statistics; 
I.5.1 [Pattern recognition]: models - geometric

General Terms 

algorithms, theory 
1. INTRODUCTION

Nonlinear dimensionality reduction (NLDR) algorithms address
the following problem: given a high-dimensional collection
of data points $X \subset \mathbb{R}^N$, find a low-dimensional
embedding $\phi : X \to \mathbb{R}^n$ (for some $n \ll N$) which
faithfully preserves the 'intrinsic' structure of the data. For
instance, if the data have been obtained by sampling from some
unknown manifold $M \subset \mathbb{R}^N$ (perhaps the parameter
space of some physical system) then $\phi$ might correspond to an
$n$-dimensional coordinate system on $M$. If $M$ is completely
and non-redundantly parametrized by these $n$ coordinates,
then the NLDR is regarded as having succeeded completely.

Principal components analysis, or linear regression, is the
simplest form of dimensionality reduction; the embedding
function is taken to be a linear projection. This is closely
related to (and sometimes identified with) classical
multidimensional scaling [2].

When there are no satisfactory linear projections, it becomes
necessary to use NLDR. Prominent algorithms for
NLDR include Locally Linear Embedding [Q], Isomap [16], 
Laplacian Eigenmaps [1], Hessian Eigenmaps [5], and many
more. 

These techniques share an implicit assumption that the
unknown manifold $M$ is well-described by a finite set of
coordinate functions $\phi_1, \phi_2, \dots, \phi_n : M \to \mathbb{R}$.
Explicitly, some of the correctness theorems in these studies
depend on the hypothesis that $M$ has the topological structure
of a convex domain in some $\mathbb{R}^n$. This hypothesis
guarantees that good coordinates exist, and shifts the burden
of proof onto showing that the algorithm recovers these
coordinates.

In this paper we ask what happens when this assumption
fails. The simplest space which challenges the assumption is
the circle, which is one-dimensional but requires two real
coordinates for a faithful embedding. Other simple examples
include the annulus, the torus, the figure eight and the
2-sphere, the last three of which present topological
obstructions to being embedded in the Euclidean space of their
natural dimension. We propose that an appropriate response to
the problem is to enlarge the class of coordinate functions to
include circle-valued coordinates $\theta : M \to S^1$. In a
physical setting, circular coordinates occur naturally as
angular and phase variables. Spaces like the annulus and the
torus are well described by a combination of real and circular
coordinates. (The 2-sphere is not so lucky, and must await its
day.)

The goal of this paper is to describe a natural procedure
for constructing circular coordinates on a nonlinear data set
using techniques from classical algebraic topology and its
21st-century grandchild, persistent topology. We direct the
reader to [9] as a general reference for algebraic topology, and
to [17] for a streamlined account of persistent homology.

1.1 Related work 

There have been other attempts to address the problem 
of finding good coordinate representations of simple non- 
Euclidean data spaces. One approach [13] is to use modified 
versions of multidimensional scaling specifically devised to 
find the best embedding of a data set into the cylinder, the 
sphere and so on. The target space has to be chosen in ad- 
vance. Another class of approaches [10] H] involves cutting 
the data manifold along arcs and curves until it has trivial 
topology. The resulting configuration can then be embedded 
in Euclidean space in the usual way. In our approach, the 
number of circular coordinates is not fixed in advance, but is 
determined experimentally after a persistent homology cal- 
culation. Moreover, there is no cutting involved; the coor- 
dinate functions respect the original topology of the data. 

1.2 Overview 

The principle behind our algorithm is the following equation
from homotopy theory, valid for topological spaces $X$
with the homotopy type of a cell complex (which covers
everything we normally encounter):

$$[X, S^1] = H^1(X; \mathbb{Z}) \tag{1}$$

The left-hand side denotes the set of equivalence classes of
continuous maps from $X$ to the circle $S^1$; two maps are
equivalent if they are homotopic (meaning that one map
can be deformed continuously into the other); the right-hand
side denotes the 1-dimensional cohomology of $X$, taken with
integer coefficients. In other language: $S^1$ is the classifying
space for $H^1$, or equivalently $S^1$ is the Eilenberg-MacLane
space $K(\mathbb{Z}, 1)$. See Section 4.3 of [9].

If $X$ is a contractible space (such as a convex subset of
$\mathbb{R}^n$), then $H^1(X; \mathbb{Z}) = 0$ and equation (1)
tells us not to bother looking for circular functions: all such
functions are homotopic to a constant function. On the other
hand, if $X$ has nontrivial topology then there may well exist a
nonzero cohomology class $[\alpha] \in H^1(X; \mathbb{Z})$; we can
then build a continuous function $X \to S^1$ which in some sense
reveals $[\alpha]$.

Our strategy divides into the following steps. 

1. Represent the given discrete data set as a simplicial
complex or filtered simplicial complex.

2. Use persistent cohomology to identify a 'significant'
cohomology class in the data. For technical reasons,
we carry this out with coefficients in the field $\mathbb{F}_p$ of
integers modulo $p$, for some prime $p$. This gives us
$[\alpha_p] \in H^1(X; \mathbb{F}_p)$.

3. Lift $[\alpha_p]$ to a cohomology class with integer coefficients:
$[\alpha] \in H^1(X; \mathbb{Z})$.

4. Smoothing: replace the integer cocycle $\alpha$ by a harmonic
cocycle in the same cohomology class:
$\bar{\alpha} \in C^1(X; \mathbb{R})$.

5. Integrate the harmonic cocycle $\bar{\alpha}$ to a circle-valued
function $\theta : X \to S^1$.

The paper is organized as follows. In Section 2.1 we derive
what we need of equation (1). Steps (1-5) of the algorithm
are addressed in Sections 2.2-2.6 respectively. In Section 3
we report some experimental results.



2. ALGORITHM DETAILS 

2.1 Cohomology and circular functions 

Let $X$ be a finite simplicial complex. Let $X^0, X^1, X^2$
denote the sets of vertices, edges and triangles of $X$,
respectively. We suppose that the vertices are totally ordered (in
an arbitrary way). If $a < b$ then the edge between vertices
$a, b$ is always written $ab$ and not $ba$. Similarly, if $a < b < c$
then the triangle with vertices $a, b, c$ is always written $abc$.

Cohomology can be defined as follows. Let $\mathbb{A}$ be a
commutative ring (for example $\mathbb{A} = \mathbb{Z}, \mathbb{F}_p, \mathbb{R}$).
We define 0-cochains, 1-cochains, and 2-cochains as follows:

$$C^0 = C^0(X; \mathbb{A}) = \{\text{functions } f : X^0 \to \mathbb{A}\}$$

$$C^1 = C^1(X; \mathbb{A}) = \{\text{functions } \alpha : X^1 \to \mathbb{A}\}$$

$$C^2 = C^2(X; \mathbb{A}) = \{\text{functions } A : X^2 \to \mathbb{A}\}$$

These are modules over $\mathbb{A}$. We now define coboundary maps
$d_0 : C^0 \to C^1$ and $d_1 : C^1 \to C^2$:

$$(d_0 f)(ab) = f(b) - f(a)$$

$$(d_1 \alpha)(abc) = \alpha(bc) - \alpha(ac) + \alpha(ab)$$

Let $\alpha \in C^1$. If $d_1 \alpha = 0$ we say that $\alpha$ is a
cocycle. If $d_0 f = \alpha$ admits a solution $f \in C^0$ we say
that $\alpha$ is a coboundary. The solution $f$, if it exists, can be
thought of as the discrete integral of $\alpha$. It is unique up to
adding constants on each connected component of $X$.

It is easily verified that $d_1 d_0 f = 0$ for any $f \in C^0$. Thus,
coboundaries are always cocycles, or equivalently
$\mathrm{Im}(d_0) \subseteq \mathrm{Ker}(d_1)$. We can measure the
difference between coboundaries and cocycles by defining the
1-cohomology of $X$ to be the quotient module

$$H^1(X; \mathbb{A}) = \mathrm{Ker}(d_1) / \mathrm{Im}(d_0).$$

We say that two cocycles $\alpha, \beta$ are cohomologous if
$\alpha - \beta$ is a coboundary.
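As an illustration, the coboundary maps can be written as integer matrices and the identity $d_1 d_0 = 0$ checked directly on a small complex. The following is a minimal sketch in plain Python; the names `coboundary_matrices` and `matmul`, and the list-of-rows matrix layout, are assumptions made for this example and do not come from the paper.

```python
def coboundary_matrices(vertices, edges, triangles):
    """Return d0 (edges x vertices) and d1 (triangles x edges) as lists of rows.

    Edges are pairs (a, b) with a < b; triangles are triples (a, b, c) with
    a < b < c, following the ordering convention of Section 2.1.
    """
    v_index = {v: i for i, v in enumerate(vertices)}
    e_index = {e: i for i, e in enumerate(edges)}
    d0 = [[0] * len(vertices) for _ in edges]
    for row, (a, b) in enumerate(edges):
        d0[row][v_index[b]] += 1      # (d0 f)(ab) = f(b) - f(a)
        d0[row][v_index[a]] -= 1
    d1 = [[0] * len(edges) for _ in triangles]
    for row, (a, b, c) in enumerate(triangles):
        d1[row][e_index[(b, c)]] += 1  # (d1 alpha)(abc) = alpha(bc) - alpha(ac) + alpha(ab)
        d1[row][e_index[(a, c)]] -= 1
        d1[row][e_index[(a, b)]] += 1
    return d0, d1


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]


# A filled triangle on the ordered vertices 0 < 1 < 2.
d0, d1 = coboundary_matrices([0, 1, 2], [(0, 1), (0, 2), (1, 2)], [(0, 1, 2)])
print(matmul(d1, d0))  # [[0, 0, 0]]
```

The zero product is the matrix form of the statement that every coboundary is a cocycle.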

We now consider integer coefficients. The following
proposition fulfils part of the promise of equation (1), by
producing circle-valued functions from integer cocycles. It will
be helpful to think of $S^1$ as the quotient group
$\mathbb{R}/\mathbb{Z}$.

Proposition 1. Let $\alpha \in C^1(X; \mathbb{Z})$ be a cocycle. Then
there exists a continuous function $\theta : X \to \mathbb{R}/\mathbb{Z}$
which maps each vertex to 0, and each edge $ab$ around the entire
circle with winding number $\alpha(ab)$.

Proof. We can define $\theta$ inductively on the vertices, edges,
triangles, ... of $X$. The vertices and edges follow the
prescription in the statement of the proposition. To extend
$\theta$ to the triangles, it is necessary that the winding number
of $\theta$ along the boundary of each triangle $abc$ is zero. And
indeed this is $\alpha(bc) - \alpha(ac) + \alpha(ab) = d_1\alpha(abc) = 0$.
Since the higher homotopy groups of $S^1$ are all zero ([9],
Section 4.3), $\theta$ can then be extended to the higher cells of
$X$ without obstruction. □

The construction in Proposition 1 is unsatisfactory in the
sense that all vertices are mapped to the same point. All
variation in the circle parameter takes place in the interior
of the edges (and higher cells). This is rather unsmooth.
For more leeway, we consider real coefficients.

Proposition 2. Let $\bar{\alpha} \in C^1(X; \mathbb{R})$ be a cocycle.
Suppose we can find $\alpha \in C^1(X; \mathbb{Z})$ and
$f \in C^0(X; \mathbb{R})$ such that $\bar{\alpha} = \alpha + d_0 f$.
Then there exists a continuous function
$\theta : X \to \mathbb{R}/\mathbb{Z}$ which maps each edge $ab$
linearly to an interval of length $\bar{\alpha}(ab)$, measured with
sign.

In other words, we can construct a circle-valued function
out of any real cocycle $\bar{\alpha}$ whose cohomology class
$[\bar{\alpha}]$ lies in the image of the natural homomorphism
$H^1(X; \mathbb{Z}) \to H^1(X; \mathbb{R})$.

Proof. Define $\theta$ on the vertices of $X$ by setting $\theta(a)$
to be $f(a)$ mod $\mathbb{Z}$. For each edge $ab$, we have

$$\begin{aligned}
\theta(b) - \theta(a) &= f(b) - f(a) \\
&= d_0 f(ab) \\
&= \bar{\alpha}(ab) - \alpha(ab)
\end{aligned}$$

which is congruent to $\bar{\alpha}(ab)$ mod $\mathbb{Z}$, since
$\alpha(ab)$ is an integer.

It follows that $\theta$ can be taken to map $ab$ linearly onto an
interval of signed length $\bar{\alpha}(ab)$. Since $\bar{\alpha}$ is a
cocycle, $\theta$ can be extended to the triangles as before; then
to the higher cells. □

Proposition 2 suggests the following tactic: from an integer
cocycle $\alpha$ we construct a cohomologous real cocycle
$\bar{\alpha} = \alpha + d_0 f$, and then define $\theta = f$ mod
$\mathbb{Z}$ on the vertices of $X$. If we can construct
$\bar{\alpha}$ so that the edge-lengths $|\bar{\alpha}(ab)|$ are
small, then the behaviour of $\theta$ will be apparent from its
restriction to the vertices. See Section 2.5.

2.2 Point-cloud data to simplicial complex 

We now begin describing the workflow in detail. The input
is a point-cloud data set: in other words, a finite set
$S \subset \mathbb{R}^N$ or more generally a finite metric space.
The first step is to convert $S$ into a simplicial complex and to
identify a stable-looking integer cohomology class. This will
occupy the next three subsections.

The first lesson of point-cloud topology [7] is that
point-clouds are best represented by 1-parameter nested families
of simplicial complexes. There are several candidate
constructions: the Vietoris-Rips complex
$X^\epsilon = \mathrm{Rips}(S, \epsilon)$ has vertex set $S$ and
includes a $k$-simplex whenever all $k + 1$ vertices lie pairwise
within distance $\epsilon$ of each other. The witness complex
$X^\epsilon = \mathrm{Witness}(L, \epsilon)$ uses a smaller vertex
set $L \subset S$ and includes a $k$-simplex when the $k + 1$
vertices lie close to other points of $S$, in a certain precise
sense (see [3, 8]). In both cases,
$X^\epsilon \subseteq X^{\epsilon'}$ whenever $\epsilon \leq \epsilon'$.
Either of these constructions will serve our purposes, but the
witness complex has the computational advantage of being
considerably smaller.

We determine $X^\epsilon$ only up to its 2-skeleton, since we are
interested in $H^1$.
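For concreteness, the 2-skeleton of a Rips complex at a single scale can be computed by brute force. The sketch below is illustrative only: the function name `rips_2_skeleton` is hypothetical, points are assumed to be coordinate tuples in Euclidean space, and "within distance $\epsilon$" is read as "at most $\epsilon$". The cubic-time triangle search is acceptable only for small point sets.

```python
from itertools import combinations
from math import dist


def rips_2_skeleton(points, eps):
    """Vertices, edges and triangles of Rips(points, eps), indexed by position."""
    vertices = list(range(len(points)))
    # An edge ab (a < b) is present when the two points are within distance eps.
    edges = [(a, b) for a, b in combinations(vertices, 2)
             if dist(points[a], points[b]) <= eps]
    edge_set = set(edges)
    # A triangle abc (a < b < c) is present when all three of its edges are.
    triangles = [(a, b, c) for a, b, c in combinations(vertices, 3)
                 if (a, b) in edge_set and (a, c) in edge_set and (b, c) in edge_set]
    return vertices, edges, triangles


# Corners of the unit square: the diagonals have length sqrt(2), about 1.414.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(rips_2_skeleton(square, 1.1))  # four edges, no triangles: a combinatorial circle
print(rips_2_skeleton(square, 1.5))  # all six edges and all four triangles
```

At scale 1.1 the complex is a hollow square, which is exactly the kind of circle structure the later steps are meant to detect.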

2.3 Persistent cohomology 

Having constructed a 1-parameter family $\{X^\epsilon\}$, we
apply the principle of persistence to identify cocycles that are
stable across a large range for $\epsilon$. Suppose that
$\epsilon_1, \epsilon_2, \dots, \epsilon_m$ are the critical values
where the complex $X^\epsilon$ gains new cells. The family can be
represented as a diagram

$$X^{\epsilon_1} \longrightarrow X^{\epsilon_2} \longrightarrow \cdots \longrightarrow X^{\epsilon_m}$$

of simplicial complexes and inclusion maps. For any coefficient
field $\mathbb{F}$, the cohomology functor $H^1(-; \mathbb{F})$
converts this diagram into a diagram of vector spaces and linear
maps over $\mathbb{F}$; the arrows are reversed:

$$H^1(X^{\epsilon_1}; \mathbb{F}) \longleftarrow H^1(X^{\epsilon_2}; \mathbb{F}) \longleftarrow \cdots \longleftarrow H^1(X^{\epsilon_m}; \mathbb{F})$$



According to the theory of persistence [6, 17], such a
diagram decomposes as a direct sum of 1-dimensional terms
indexed by half-open intervals of the form
$[\epsilon_i, \epsilon_j)$. Each such term corresponds to a cochain
$\alpha \in C^*(X^\epsilon)$ that satisfies the cocycle condition
for $\epsilon < \epsilon_j$ and becomes a coboundary for
$\epsilon < \epsilon_i$. The collection of intervals can be
displayed graphically as a persistence diagram, by representing
each interval $[\epsilon_i, \epsilon_j)$ as a point
$(\epsilon_i, \epsilon_j)$ in the Cartesian plane above the
main diagonal. We think of long intervals as representing
trustworthy (i.e. stable) topological information.

Choice of coefficients. The persistence decomposition
theorem applies to diagrams of vector spaces over a
field. When we work over the ring of integers $\mathbb{Z}$, however,
the result is known to fail: there need not be an interval
decomposition. This is unfortunate, since we require integer
cocycles to construct circle maps. To finesse this problem,
we pick an arbitrary prime number $p$ (such as $p = 47$) and
carry out our persistence calculations over the finite field
$\mathbb{F} = \mathbb{F}_p$. The resulting $\mathbb{F}_p$ cocycle
must then be converted to integer coefficients: we address this
in Section 2.4.

In principle we can use the ideas in [TT to calculate the 
persistent cohomology intervals and then select a long interval
$[\epsilon_i, \epsilon_j)$ and a specific
$\delta \in [\epsilon_i, \epsilon_j)$. We then let $X = X^\delta$
and take $\alpha$ to be the cocycle in $C^1(X; \mathbb{F})$
corresponding to the interval.

Explicitly, persistent cocycles can be calculated in the
following way. We thank Dmitriy Morozov for this algorithm.
Suppose that the simplices in the filtered complex are
totally ordered, and labelled
$\sigma_1, \sigma_2, \dots, \sigma_m$ so that $\sigma_i$ arrives
at time $\epsilon_i$. For $k = 0, 1, \dots, m$ we maintain the
following information:

• a set of indices $I_k \subseteq \{1, 2, \dots, k\}$ associated
with 'live' cocycles;

• a list of cocycles $(\alpha_i : i \in I_k)$ in
$C^*(X^{\epsilon_k}; \mathbb{F})$.

The cocycle $\alpha_i$ involves only $\sigma_i$ and those simplices
of the same dimension that appear later in the filtration sequence
(thus only $\sigma_j$ with $j \geq i$).

Initially $I_0 = \emptyset$ and the list of cocycles is empty.

To update from $k - 1$ to $k$, we compute the coboundaries
of the cocycles $(\alpha_i : i \in I_{k-1})$ of $X^{\epsilon_{k-1}}$
within the larger complex $X^{\epsilon_k}$ obtained by including the
simplex $\sigma_k$. In fact, these coboundaries must be multiples
of the elementary cocycle $[\sigma_k]$ defined by
$[\sigma_k](\sigma_k) = 1$, and $[\sigma_k](\sigma_j) = 0$
otherwise. We can write $d\alpha_i = c_i [\sigma_k]$. If all the
$c_i$ are zero, then we have one new cocycle: let
$I_k = I_{k-1} \cup \{k\}$ and define $\alpha_k = [\sigma_k]$.
Otherwise, we must lose a cocycle. Let $j \in I_{k-1}$ be the
largest index for which $c_j \neq 0$. We delete $\alpha_j$ by
setting $I_k = I_{k-1} \setminus \{j\}$, and we restore the earlier
cocycles by setting
$\alpha_i \leftarrow \alpha_i - (c_i/c_j)\alpha_j$. In this latter
case, we write the persistence interval $[\epsilon_j, \epsilon_k)$
to the output.

At the end of the process, surviving cocycles are associated
with semi-infinite intervals: $[\epsilon_i, \infty)$ for
$i \in I_m$.
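The update rule described above can be sketched directly in code. This is a minimal illustration under stated assumptions, not the jPlex implementation: the function name `persistent_cocycles` is hypothetical, the filtration is assumed to be a list of `(time, simplex)` pairs with each simplex a sorted tuple of vertices and every face listed before its cofaces, and cocycles are stored as dictionaries from simplices to nonzero coefficients modulo a prime `p`.

```python
def persistent_cocycles(filtration, p=47):
    """Return (finite, infinite): finite intervals (dim, birth, death) and
    surviving classes (dim, birth, cocycle) over the field of integers mod p."""
    live = {}      # index k of the creating simplex -> cocycle {simplex: coeff mod p}
    finite = []
    for k, (time_k, simplex) in enumerate(filtration):
        # c[i] is the coefficient of d(alpha_i) on the new simplex, when nonzero.
        c = {}
        if len(simplex) > 1:
            for i, alpha in live.items():
                value = 0
                for pos in range(len(simplex)):
                    face = simplex[:pos] + simplex[pos + 1:]
                    value += (-1) ** pos * alpha.get(face, 0)
                if value % p:
                    c[i] = value % p
        if not c:
            # Every live cocycle extends by zero: the elementary cocycle is born.
            live[k] = {simplex: 1}
            continue
        # Otherwise the youngest cocycle with nonzero coefficient dies here.
        j = max(c)
        alpha_j = live.pop(j)
        inverse = pow(c[j], p - 2, p)  # 1 / c_j in F_p, by Fermat's little theorem
        for i, c_i in c.items():
            if i == j:
                continue
            factor = c_i * inverse % p
            alpha = live[i]
            for s, v in alpha_j.items():  # alpha_i <- alpha_i - (c_i / c_j) alpha_j
                updated = (alpha.get(s, 0) - factor * v) % p
                if updated:
                    alpha[s] = updated
                else:
                    alpha.pop(s, None)
        finite.append((len(simplex) - 2, filtration[j][0], time_k))
    infinite = [(len(filtration[i][1]) - 1, filtration[i][0], alpha)
                for i, alpha in live.items()]
    return finite, infinite


# Hollow triangle: three vertices at time 0, then edges at times 1, 2, 3.
hollow = [(0.0, (0,)), (0.0, (1,)), (0.0, (2,)),
          (1.0, (0, 1)), (2.0, (0, 2)), (3.0, (1, 2))]
finite, infinite = persistent_cocycles(hollow)
print(finite)    # two 0-dimensional classes die as the components merge
print(infinite)  # one surviving 0-class and the 1-cocycle supported on edge (1, 2)
```

In this example the last edge closes the loop, so all coefficients $c_i$ vanish and the elementary cocycle on that edge is created; adding the filled triangle afterwards would kill it.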

Remark. The reader may be more familiar with persis- 
tence diagrams in homology rather than cohomology. In 
fact, the universal coefficient theorem [9] implies that the
two diagrams are identical. The salient point is that coho- 
mology is the vector-space dual of homology, when work- 
ing with field coefficients. That said, we cannot simply use 
the usual algorithm for persistent homology: we are inter- 
ested in obtaining explicit cocycles, whereas the classical 
algorithm [17] returns cycles. 



We will establish the correctness of this algorithm in the 
archival version of this paper. The expert reader may regard 
this as an exercise in the theory of persistence. 

2.4 Lifting to integer coefficients 

We now have a simplicial complex $X = X^\delta$ and a cocycle
$\alpha_p \in C^1(X; \mathbb{F}_p)$. The next step is to 'lift'
$\alpha_p$ by constructing an integer cocycle $\alpha$ which reduces
to $\alpha_p$ modulo $p$.

To show that this is (almost) always possible, note that
the short exact sequence of coefficient rings
$0 \to \mathbb{Z} \xrightarrow{\cdot p} \mathbb{Z} \to \mathbb{F}_p \to 0$
gives rise to a long exact sequence, called
the Bockstein sequence (see Section 3.E of [9]). Here is the
relevant section of the sequence:

$$\cdots \to H^1(X; \mathbb{Z}) \to H^1(X; \mathbb{F}_p) \xrightarrow{\beta} H^2(X; \mathbb{Z}) \xrightarrow{\cdot p} H^2(X; \mathbb{Z}) \to \cdots$$

By exactness, the Bockstein homomorphism $\beta$ induces an
isomorphism between the cokernel of
$H^1(X; \mathbb{Z}) \to H^1(X; \mathbb{F}_p)$ and the kernel of
$H^2(X; \mathbb{Z}) \xrightarrow{\cdot p} H^2(X; \mathbb{Z})$, and
this kernel is precisely the set of $p$-torsion elements of
$H^2(X; \mathbb{Z})$. If there is no $p$-torsion, then it follows
immediately that the cokernel of the first map is zero. In other
words $H^1(X; \mathbb{Z}) \to H^1(X; \mathbb{F}_p)$ is surjective;
any cocycle $\alpha_p \in C^1(X; \mathbb{F}_p)$ can be lifted to a
cocycle $\alpha \in C^1(X; \mathbb{Z})$.

If we are unluckily sabotaged by $p$-torsion, then we pick
another prime and redo the calculation from scratch: it is
enough to pick a prime that does not divide the order of the
torsion subgroup of $H^2(X; \mathbb{Z})$, so almost any prime will
do.

In practice, we construct $\alpha$ by taking the coefficients of
$\alpha_p$ in $\mathbb{F}_p$ and replacing them with integers in the
correct congruence class modulo $p$. The default choice is to
choose coefficients close to zero. If $d_1\alpha = 0$ then we are
done; otherwise it becomes necessary to do some repair work.
Certainly $d_1\alpha \equiv 0$ modulo $p$, so we can write
$d_1\alpha = p\eta$ for some $\eta \in C^2(X; \mathbb{Z})$. In the
absence of $p$-torsion, we can then solve $\eta = d_1\zeta$ for
$\zeta \in C^1(X; \mathbb{Z})$, and then the required lift is
$\alpha - p\zeta$. Fortunately, this has not proved necessary in
any of our examples.
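The default lifting step is simple enough to illustrate. The sketch below assumes a 1-cocycle stored as a dictionary from edges `(a, b)` to coefficients modulo `p`; the names `lift_to_integers` and `is_cocycle` are hypothetical. It covers only the "coefficients close to zero" choice and the check $d_1\alpha = 0$, not the repair work needed when that check fails.

```python
def lift_to_integers(alpha_p, p):
    """Replace each F_p coefficient by the congruent integer closest to zero."""
    lifted = {}
    for edge, value in alpha_p.items():
        r = value % p
        lifted[edge] = r - p if r > p // 2 else r
    return lifted


def is_cocycle(alpha, triangles):
    """Check (d1 alpha)(abc) = alpha(bc) - alpha(ac) + alpha(ab) = 0 on every triangle."""
    return all(alpha.get((b, c), 0) - alpha.get((a, c), 0) + alpha.get((a, b), 0) == 0
               for a, b, c in triangles)


# A cocycle mod 47 on a filled triangle: 1 - 0 + 46 = 47, which is 0 modulo 47.
alpha_p = {(0, 1): 46, (0, 2): 0, (1, 2): 1}
alpha = lift_to_integers(alpha_p, 47)
print(alpha)                           # {(0, 1): -1, (0, 2): 0, (1, 2): 1}
print(is_cocycle(alpha, [(0, 1, 2)]))  # True
```

Taking the representatives 46, 0, 1 literally would give $d_1\alpha = 47 \neq 0$ over the integers, which is why representatives near zero are the natural first attempt.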

Remark. We expect that p-torsion is extremely rare in 
'real' data sets, since it is symptomatic of rather subtle 
topological phenomena. For instance, the simplest examples 
which exhibit 2-torsion are the nonorientable closed surfaces 
(such as the projective plane and the Klein bottle). 

2.5 Harmonic smoothing 

Given an integer cocycle $\alpha \in C^1(X; \mathbb{Z})$, or indeed
a real cocycle $\alpha \in C^1(X; \mathbb{R})$, we wish to find the
'smoothest' real cocycle $\bar{\alpha} \in C^1(X; \mathbb{R})$
cohomologous to $\alpha$. It turns out that what we want is the
harmonic cocycle representing the cohomology class $[\alpha]$.

We define smoothness. Each of the spaces $C^i(X; \mathbb{R})$ comes
with a natural Euclidean metric:

$$\|f\|^2 = \sum_{a \in X^0} |f(a)|^2, \qquad \|\alpha\|^2 = \sum_{ab \in X^1} |\alpha(ab)|^2, \qquad \|A\|^2 = \sum_{abc \in X^2} |A(abc)|^2.$$

A circle-valued function $\theta$ is 'smooth' if its total
variation across the edges of $X$ is small. The terms
$|\bar{\alpha}(ab)|^2$ capture the variation across individual
edges; therefore what we must minimize is $\|\bar{\alpha}\|^2$.

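Proposition 3. Let $\alpha \in C^1(X; \mathbb{R})$. There is a unique
solution $\bar{\alpha}$ to the least-squares minimization problem

$$\operatorname*{argmin}_{\bar{\alpha}} \left\{ \|\bar{\alpha}\|^2 \;\middle|\; \exists f \in C^0(X; \mathbb{R}),\ \bar{\alpha} = \alpha + d_0 f \right\}. \tag{2}$$

Moreover, $\bar{\alpha}$ is characterized by the equation
$d_0^* \bar{\alpha} = 0$, where $d_0^*$ is the adjoint of $d_0$ with
respect to the inner products on $C^0, C^1$.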

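Proof. Note that if $d_0^* \bar{\alpha} = 0$ then for any
$f \in C^0$ we have

$$\begin{aligned}
\|\bar{\alpha} + d_0 f\|^2 &= \|\bar{\alpha}\|^2 + 2\langle \bar{\alpha}, d_0 f \rangle + \|d_0 f\|^2 \\
&= \|\bar{\alpha}\|^2 + 2\langle d_0^* \bar{\alpha}, f \rangle + \|d_0 f\|^2 \\
&= \|\bar{\alpha}\|^2 + \|d_0 f\|^2
\end{aligned}$$

which implies that such an $\bar{\alpha}$ must be the unique
minimizer. For existence, note that

$$d_0^* \alpha + d_0^* d_0 f = 0$$

certainly has a solution $f$ if
$\mathrm{Im}(d_0^*) = \mathrm{Im}(d_0^* d_0)$. But this is a
standard fact in finite-dimensional linear algebra:
$\mathrm{Im}(A^T) = \mathrm{Im}(A^T A)$ for any real matrix $A$; this
follows from the singular value decomposition, for instance. □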

Remark. It is customary to construct the Laplacian
$\Delta = d_1^* d_1 + d_0 d_0^*$. The twin equations
$d_1 \bar{\alpha} = 0$ and $d_0^* \bar{\alpha} = 0$
immediately imply (and conversely, can be deduced from)
the single equation $\Delta \bar{\alpha} = 0$; in other words
$\bar{\alpha}$ is harmonic.
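On a small complex the minimization in Proposition 3 can be carried out exactly. The sketch below is an illustration under assumptions, not the method used in the experiments: it solves the normal equations $d_0^* d_0 f = -d_0^* \alpha$ by Gaussian elimination over the rationals rather than by LSQR, it assumes a connected complex with vertices numbered from 0, and it fixes the free constant by setting $f = 0$ at vertex 0. The function name `harmonic_smoothing` is hypothetical.

```python
from fractions import Fraction


def harmonic_smoothing(n_vertices, alpha):
    """Return (f, alpha_bar) with alpha_bar = alpha + d0 f of minimal norm.

    alpha maps edges (a, b), a < b, to numbers; f is a list with f[0] = 0.
    """
    n = n_vertices
    laplacian = [[Fraction(0)] * n for _ in range(n)]  # the matrix d0^T d0
    rhs = [Fraction(0)] * n                            # the vector -d0^T alpha
    for (a, b), value in alpha.items():
        laplacian[a][a] += 1
        laplacian[b][b] += 1
        laplacian[a][b] -= 1
        laplacian[b][a] -= 1
        rhs[b] -= value
        rhs[a] += value
    # Drop row and column 0 (gauge f[0] = 0) and build the augmented system.
    m = n - 1
    A = [row[1:] + [rhs[i]] for i, row in enumerate(laplacian) if i > 0]
    for col in range(m):
        pivot = next(r for r in range(col, m) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(m):
            if r != col and A[r][col] != 0:
                factor = A[r][col] / A[col][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    f = [Fraction(0)] + [A[i][m] / A[i][i] for i in range(m)]
    alpha_bar = {(a, b): value + f[b] - f[a] for (a, b), value in alpha.items()}
    return f, alpha_bar


# Hollow triangle with the integer cocycle supported on the single edge (1, 2).
f, alpha_bar = harmonic_smoothing(3, {(0, 1): 0, (0, 2): 0, (1, 2): 1})
print(alpha_bar)  # values 1/3, -1/3, 1/3 on edges (0, 1), (0, 2), (1, 2)
```

The winding is spread evenly around the loop: each edge carries one third of a turn, with the sign on edge (0, 2) reflecting that the edge is oriented against the direction of travel.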

2.6 Integration 

The least-squares problem in equation (2) can be solved
using a standard algorithm such as LSQR [12]. By
Proposition 2 we can use the solution parameter $f$ to define the
circular coordinate $\theta$ on the vertices of $X$. This works
because the original cocycle $\alpha$ has integer coefficients.

More generally, if $\bar{\alpha}$ is an arbitrary real cocycle such
that $[\bar{\alpha}] \in \mathrm{Im}(H^1(X; \mathbb{Z}) \to H^1(X; \mathbb{R}))$,
it is a straightforward matter to integrate $\bar{\alpha}$ to a
circle-valued function $\theta$ on the vertex set $X^0$. Suppose
that $X$ is connected (if not, each connected component can be
treated separately) and pick a starting vertex $x_0$ and assign
$\theta(x_0) = 0$. One can use Dijkstra's algorithm to find
shortest paths to each remaining vertex from $x_0$. When a new
vertex $b$ enters the structure via an edge $ab$, we assign
$\theta(b) = \theta(a) + \bar{\alpha}(ab)$ (or
$\theta(a) - \bar{\alpha}(ba)$ if the edge is correctly identified
as $ba$). If a vertex $a$ is connected to $x_0$ by multiple paths
then the different possible values of $\theta(a)$ differ by an
integer; this is where we use the hypothesis that $\bar{\alpha}$ is
cohomologous to an integer cocycle.
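The integration step can be sketched as a graph traversal. In the example below a breadth-first search stands in for Dijkstra's algorithm, which is enough here because any spanning tree gives the same values modulo 1; the function name `integrate_cocycle` and the dictionary representation of the cocycle are assumptions made for illustration.

```python
from collections import deque
from fractions import Fraction


def integrate_cocycle(n_vertices, alpha_bar, start=0):
    """Integrate a real 1-cocycle to circle values theta[v] in [0, 1).

    alpha_bar maps edges (a, b), a < b, to numbers. Only vertices in the
    connected component of `start` receive a value.
    """
    neighbours = {v: [] for v in range(n_vertices)}
    for (a, b), value in alpha_bar.items():
        neighbours[a].append((b, value))    # traversing ab forwards adds alpha_bar(ab)
        neighbours[b].append((a, -value))   # traversing it backwards subtracts it
    theta = {start: Fraction(0)}
    queue = deque([start])
    while queue:
        a = queue.popleft()
        for b, step in neighbours[a]:
            if b not in theta:
                theta[b] = (theta[a] + step) % 1
                queue.append(b)
    return theta


# A cocycle on the hollow triangle that winds once around the loop in thirds.
alpha_bar = {(0, 1): Fraction(1, 3), (0, 2): Fraction(-1, 3), (1, 2): Fraction(1, 3)}
print(integrate_cocycle(3, alpha_bar))  # vertices land at 0, 1/3 and 2/3 of a turn
```

The edge (1, 2) is not used by the traversal, yet the values agree with it: $\theta(2) - \theta(1) = 1/3 = \bar{\alpha}(12)$, as the integrality hypothesis guarantees modulo 1.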

3. EXPERIMENTS 
3.1 Software 

The following experiments were carried out using the Java- 
based jPlex simplicial complex software p^, with high-level 
scripting and numerical analysis in MATLAB. We ran a de- 
velopment version of jPlex to obtain explicit persistent co- 
homology cocycles. We expect to include the code in the 
next release of jPlex. We used Paige and Saunders' imple- 
mentation of LSQR [11] for the least-squares problem in the 
harmonic smoothing step. Timings were determined using
MATLAB's built-in 'tic' and 'toc' commands, and are
included for relative comparison against each other.



3.2 General procedure 

We tested our methods on several synthetic data sets with
known topology, ranging from the humble circle itself to a
genus-2 surface ('double torus'). Most of the examples were
embedded in $\mathbb{R}^2$ or $\mathbb{R}^3$, with the exception of
a sample from a complex projective curve (embedded in
$\mathbb{CP}^2$) and a synthetic image-like data set (embedded in
$\mathbb{R}^{120000}$).

In each case we selected vertices for the filtered simplicial 
complex: either the whole set, or a smaller well-distributed 
subset of 'landmarks' selected by iterative furthest-point 
sampling. We then built a Rips or witness complex, with 
maximum radius generally chosen to ensure around 10^ sim- 
plices in the complex. 
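Iterative furthest-point sampling, mentioned above for landmark selection, can be sketched as follows. This is an illustrative version only: the function name `furthest_point_landmarks`, the choice of the first landmark, and the Euclidean metric are assumptions, and ties are broken by taking the lowest index.

```python
from math import dist


def furthest_point_landmarks(points, n_landmarks, first=0):
    """Greedy landmark selection; returns (landmark indices, covering radius)."""
    landmarks = [first]
    # nearest[i] is the distance from point i to its closest landmark so far.
    nearest = [dist(p, points[first]) for p in points]
    while len(landmarks) < n_landmarks:
        # The next landmark is the point furthest from all current landmarks.
        chosen = max(range(len(points)), key=nearest.__getitem__)
        landmarks.append(chosen)
        nearest = [min(d, dist(p, points[chosen])) for d, p in zip(nearest, points)]
    return landmarks, max(nearest)


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
print(furthest_point_landmarks(points, 2))  # ([0, 3], 2.0)
print(furthest_point_landmarks(points, 3))  # ([0, 3, 2], 1.0)
```

The second return value is the largest distance from any data point to its nearest landmark, which is one natural reading of a covering radius for the landmark set.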

In most cases, we show the persistence diagram produced 
by the cocycle computation. The chosen value $\delta$ is marked
on the diagonal, with its upper-left quadrant indicated in 
green lines. The persistent cocycles available at that pa- 
rameter value are precisely those contained in that quadrant. 
Each of those cocycles then produces a circular coordinate. 

There are various figures associated with each example. 
Most important are the correlation scatter plots: each scat- 
ter plot compares two circular coordinate functions. These 
may be functions produced by the computation ('inferred 
coordinates') or known parameters. These scatter plots are 
drawn in the unit square, which is of course really a torus
$S^1 \times S^1$.

When the original data are embedded in $\mathbb{R}^2$ or $\mathbb{R}^3$, we
also display the circular coordinates directly on the data 
set, plotting each point in color according to its coordinate 
value interpreted on the standard hue-circle. This works less 
well in grayscale reproductions, of course. 

Finally, in certain cases we plot coordinate values against 
frequency, as a histogram. This distributional information 
can sometimes be useful in the absence of other information. 

Remark. When the goal is to infer the topology of a data 
set whose structure is unknown, we do not have any 'known 
parameters' available to us. We can still construct correla- 
tion scatter plots between pairs of inferred coordinates, and 
the distributional histograms for each coordinate individu- 
ally. We exhort the reader to view the following examples 
through the lens of the topological inference problem: what 
structures can be distinguished using scatter plots and his- 
tograms (and persistence diagrams) alone? 

3.3 Noisy circle 

We begin with the circle itself, and its tautological circle- 
valued coordinate. 

We picked 400 points distributed along the unit circle. 
We added a uniform random variable from [0.0, 0.4] to each 
coordinate. A Rips complex was constructed with maximal 
radius 0.5, resulting in 23475 simplices. The computation of 
cohomology finished in 237 seconds. 
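A data set of this kind can be generated in a few lines. The sketch below is an assumption-laden reading of the description: "distributed along the unit circle" is taken to mean independent uniformly random angles, the noise is added to each coordinate exactly as stated (so it is one-sided), and the function name `noisy_circle` and the seed are hypothetical.

```python
import random
from math import cos, sin, tau


def noisy_circle(n=400, noise=0.4, seed=0):
    """Return n triples (x, y, t): a jittered point and its known angle t in turns."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        angle = rng.uniform(0.0, tau)
        x = cos(angle) + rng.uniform(0.0, noise)  # uniform jitter from [0, noise]
        y = sin(angle) + rng.uniform(0.0, noise)
        sample.append((x, y, angle / tau))
    return sample


sample = noisy_circle()
print(len(sample))  # 400
```

Keeping the true angle alongside each point makes it possible to draw the correlation scatter plot of an inferred coordinate against the known parameter.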

Parametrizing at 0.4 yielded a single coordinate function, 
which very closely reproduces the tautological angle func- 
tion. Parametrizing at 0.14 yielded several possible cocycles. 
We selected one of those with low persistence; this produced 
a parametrization which 'snags' around a small gap in the 
data. 

See Figure 1. The left panel in each row shows the
histogram of coordinate values; the middle panel shows the
correlation scatter plot against the known angle function; the
right panel displays the coordinate using color. The
high-persistence ('global') coordinate correlates with the angle
function with topological degree 1. Variation in that
coordinate is uniformly distributed, as seen in the histogram. In
contrast, the low-persistence ('local') coordinate has a spiky
distribution.

3.4 Trefoil torus knot 

Another example with circle topology: see Figure 2. We
picked 400 points distributed along the (2, 3) torus knot on
a torus with radii 2.0 and 1.0. We jittered them by a
uniform random variable from [0.0, 0.2] added to each
coordinate. We generated a Rips complex up to radius 1.0,
acquiring 36936 simplices. We computed persistent cohomology
in 70 seconds. As expected, the inferred coordinate
correlates strongly with the known parameter with topological
degree 1. The histogram shows three 'bulges' corresponding
to the three high-density regions of the sampled curve,
which occur when the curve approaches the central axis of
the torus.

3.5 Rotating cube 

For a more elaborate data set with $S^1$-topology, we
generated a sequence of 657 rendered images of a colorful cube
rotating around one axis. Each image was regarded as a
vector in the Euclidean space $\mathbb{R}^{200 \cdot 200 \cdot 3}$.
From this data we built a witness complex with 50 landmark points
and constructed a single circular coordinate. Interpolating the
resulting function linearly between the landmarks gave us
coordinates for all the points in the family.

See Figure 3. The frequency distribution is comparatively
smooth (by which we mean that there are no large spikes in 
the histogram), which indicates that the coordinate does not 
have large static regions. The correlation plot of the inferred 
coordinate against the original known sequence of the cube 
images shows a correlation with topological degree 1. We 
show the progression of the animation on an evenly-spaced 
sample of representative points around the circle. 

3.6 Pair of circles 

See Figure 4 for these two examples.

Conjoined circles: we picked 400 points distributed along 
circles in the plane with radius 1 and with centres at (±1,0). 
The points were then jittered by adding noise to each coor- 
dinate taken uniformly randomly from the interval [0.0, 0.3]. 
A Rips complex was constructed with maximal radius 0.5, 
resulting in 76763 simplices. The cohomology was computed 
in 378 seconds. 

Disjoint circles: 400 points were distributed on circles of 
radius 1 centered around (±2, 0) in the plane. These points 
were subsequently disturbed by a uniform random variable 
from [0.0,0.5]. We constructed a Rips complex with maxi- 
mum radius 0.5, which gave us 45809 simplices. The coho- 
mology computation finished in about 117 seconds. 

In both cases, our method detects the two most natural
circle-valued functions. The scatter plots appear very
similar. In the conjoined case, there is some interference between
the two circles, near their meeting point.

3.7 Torus 

See Figure 5. We picked 400 points at random in the
unit square, and then used a standard parametrization to
map the points onto a torus with inner and outer radii 1.0
and 3.0. These were subsequently jittered by adding a
uniform random variable from [0.0, 0.2] to each coordinate. We
constructed a Rips complex with maximal radius $\sqrt{3}$,
resulting in 61522 simplices. The corresponding cohomology was
computed in 209 seconds.

Figure 1: Noisy circle. Persistence diagram (left). Global coordinate (top row), local coordinate (bottom
row). In each row: histogram of coordinate values (left), correlation scatter plot against known angle function
(middle), inferred coordinate in color (right).

Figure 3: Images of a rotating cube. Histogram of coordinate values (left); scatter plot against known angle
function (middle); a selection of images matched to recovered circle coordinate (right).

Figure 4: Two conjoined circles (left); two disjoint circles (right). In each case we show the persistence
diagram (top left), the two inferred coordinates (right column), the correlation scatter plot (bottom left).

The two inferred coordinates in this (fairly typical) ex- 
perimental run recover the original coordinates essentially 
perfectly: the first inferred coordinate correlates with the 
meridional coordinate with topological degree -1, while the
second inferred coordinate correlates with the longitudinal 
coordinate with degree 1. 

When the original coordinates are unavailable, the important
figure is the inferred-versus-inferred scatter plot. In
this case the scatter plot is fairly uniformly distributed over 
the entire coordinate square (i.e. torus). In other words, 
the two coordinates are decorrelated. This is slightly truer 
(and more clearly apparent in the scatter plot) for the two 
original coordinates. Contrast these with the corresponding 
scatter plots for a pair of circles (conjoined or disjoint). 

3.8 Elliptic curve 

See Figure 6. For fun, we repeated the previous experiment
with a torus abstractly defined as the zero set of a
homogeneous cubic polynomial in three variables, interpreted
as a complex projective curve. We picked 400 points at
random on $S^5 \subset \mathbb{C}^3$, subject to the cubic equation

$$x^2 y + y^2 z + z^2 x = 0.$$

To interpret these as points in $\mathbb{CP}^2$, we used the
projectively invariant metric

$$d(\xi, \eta) = \cos^{-1}(|\xi \cdot \eta|)$$

for all pairs $\xi, \eta \in S^5$. With this metric we built a Rips
complex with maximal radius 0.15. The resulting complex
had 44184 simplices, and the cohomology was computed in
56 seconds. We found two dominant coclasses that survived
beyond radius 0.15, and we computed our parametrizations
at the 0.15 mark.
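The metric used here is easy to evaluate for a pair of unit vectors. The sketch below assumes that the dot in the formula denotes the Hermitian inner product on $\mathbb{C}^3$ (with the second argument conjugated), which is the standard choice making the distance independent of the phase of either vector; the function name `projective_distance` is hypothetical.

```python
from math import acos


def projective_distance(xi, eta):
    """arccos of |<xi, eta>| for unit vectors xi, eta given as sequences of complex numbers."""
    inner = sum(a * b.conjugate() for a, b in zip(xi, eta))
    # Clamp to 1.0 so that rounding error cannot push the argument outside acos's domain.
    return acos(min(1.0, abs(inner)))


# Two representatives of the same projective point differ only by a phase.
print(projective_distance((1, 0, 0), (1j, 0, 0)))  # 0.0
# Orthogonal vectors are at the maximal distance pi / 2.
print(projective_distance((1, 0, 0), (0, 1, 0)))
```

Because only the modulus of the inner product enters, multiplying either vector by a unit complex number leaves the distance unchanged, which is what allows points of $S^5$ to stand in for points of the projective plane.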

The resulting correlation plot quite clearly exhibits the 
decorrelation which is characteristic of the torus. 




Figure 6: Elliptic curve. Persistence diagram (left), 
correlation scatter plot between the two coordinates 
(right). 



3.9 Double torus 

See Figure 7. We constructed a genus-2 surface by
generating 1600 points on a torus with inner and outer radii
1.0 and 3.0; slicing off part of the data set by a plane at
distance 3.7 from the axis of the torus, and reflecting the
remaining points in that plane. The resulting data set has
3120 points. Out of these, we picked 400 landmark points, and
constructed a witness complex with maximal radius 0.6. The
landmark set yields a covering radius $r_{\max} = 0.9982$ and a
complex with 70605 simplices. The computation took 748
seconds of active computer time. We identified the four most
significant cocycles.

Note that coordinates 1 and 4 are 'coupled' in the sense 
that they are supported over the same subtorus of the dou- 
ble torus. The scatter plot shows that the two coordinates 
appear to be completely decorrelated except for a large mass 
concentrated at a single point. This mass corresponds to the 
other subtorus, on which coordinates 1 and 4 are essentially 
constant. A similar discussion holds for coordinates 2 and 3. 

The uncoupled coordinate pairs (1,2), (1,3), (2,4), (3,4) 
produce scatter plots reminiscent of two conjoined or disjoint 
circles. 




Figure 5: Torus in $\mathbb{R}^3$. (b) Correlation scatter plots between the two original
and two inferred coordinates.